Quartet MaxCut: a fast algorithm for amalgamating quartet trees.
Abstract
Accurate phylogenetic reconstruction methods are inherently computationally heavy and are therefore limited to relatively small numbers of taxa. Supertree construction is the task of amalgamating small trees over partial taxa sets into a big tree over the complete taxa set. The need for fast and accurate supertree methods has become crucial due to the enormous number of new genomic sequences generated by modern technology and the desire to use them for classification purposes. In particular, the Assembling the Tree of Life (ATOL) program aims at constructing the evolutionary history of all living organisms on Earth. When dealing with unrooted trees, a quartet (an unrooted tree over four taxa) is the most basic piece of phylogenetic information. Therefore, quartet amalgamation stands at the heart of any supertree problem, as it concerns combining many minimal pieces of information into a single, coherent, and more comprehensive piece of information. We have devised an extremely fast algorithm for quartet amalgamation and implemented it in very efficient code. The new code can handle over a hundred million quartet trees over several hundred taxa with very high accuracy.
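To make the notion of a quartet as a "minimal piece of information" concrete, the following minimal sketch checks whether a candidate tree agrees with a set of input quartets. It is an illustration only and not the amalgamation algorithm of the paper. The representation (a tree given by the taxa on one side of each internal split, a quartet ab|cd given as a pair of pairs) and the names `displays` and `satisfied_fraction` are assumptions introduced here.

```python
def displays(splits, taxa, quartet):
    """Return True if some split of the tree separates {a, b} from {c, d}.

    `splits` holds one side of each internal bipartition of an unrooted tree
    (hypothetical representation chosen for this sketch).
    """
    (a, b), (c, d) = quartet
    left, right = {a, b}, {c, d}
    for side in splits:
        other = taxa - side
        if (left <= side and right <= other) or (left <= other and right <= side):
            return True
    return False


def satisfied_fraction(splits, taxa, quartets):
    """Fraction of the input quartets that the tree agrees with."""
    hits = sum(displays(splits, taxa, q) for q in quartets)
    return hits / len(quartets)


# Hypothetical 5-taxon unrooted tree ((A,B),C,(D,E)), described by its
# two internal splits AB|CDE and DE|ABC.
taxa = frozenset("ABCDE")
splits = [frozenset("AB"), frozenset("DE")]

# Three example quartets; the second one (AC|BD) conflicts with the tree.
quartets = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "B"), ("D", "E")),
]

fraction = satisfied_fraction(splits, taxa, quartets)
```

In this framing, a quartet amalgamation method searches for a tree over all taxa that agrees with as many input quartets as possible; the sketch only evaluates a given tree and does not perform that search.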
Similar resources
Weighted quartets phylogenetics.
Despite impressive technical and theoretical developments, reconstruction of phylogenetic trees for enormous quantities of molecular data is still a challenging task. A key tool in analyses of large data sets has been the construction of separate trees for subsets (e.g., quartets) of sequences, and subsequent combination of these subtrees into a single tree for the full set (i.e., supertree ana...
Developing Scalable Quartet Tree Encodings
Reconstructing the Tree of Life, the evolutionary history of all species, stands as one of the most significant and intensive problems in computational biology. One approach to this grand project is to use supertree methods that merge a set of smaller trees (or source trees) into one single tree. In practice, most biologists use a particular supertree method called Matrix Representation with Pa...
Multi-SpaM: a Maximum-Likelihood approach to Phylogeny reconstruction based on Multiple Spaced-Word Matches
Word-based or ‘alignment-free’ methods for phylogeny reconstruction are much faster than traditional approaches, but they are generally less accurate. Most of these methods calculate pairwise distances for a set of input sequences, for example from word frequencies or from so-called spaced-word matches. In this paper, we propose the first word-based approach to tree reconstruction that is based ...
QDist-quartet distance between evolutionary trees
SUMMARY: QDist is a program for computing the quartet distance between two unrooted trees, i.e. the number of quartet topology differences between the trees, where a quartet topology is the topological subtree induced by four species. The program is based on an algorithm with running time O(n log² n), which makes it practical to compare large trees. Available under GNU license. AVAILABILITY: ht...
Estimating Species Trees from Quartet Gene Tree Distributions under the Coalescent Model
In this article we propose a new method, which we name ‘quartet neighbor joining’, or ‘quartet-NJ’, to infer an unrooted species tree on a given set of taxa T from empirical distributions of unrooted quartet gene trees on all four-taxon subsets of T. In particular, quartet-NJ can be used to estimate a species tree on T from distributions of gene trees on T. The quartet-NJ algorithm is concept...
Journal title:
- Molecular Phylogenetics and Evolution
Volume 62, issue 1
Pages -
Publication date 2012